Text Search Example

This example illustrates how to run a text search for metadata using the geodex web service.

Step 1: Create a variable to hold the text search value.

text_search_value = "carbon";

Step 2: Set the number of results to return per provider.

number_of_results = 100;

Step 3: Set the type of search to harvest the top n results per provider.

search_type = "textindex/searchset";

Step 4: Set the request domain URL for the geodex web service.

% Set core domain name for request URL
domain_url = "http://geodex.org/api/v1/";
% Create request URL
request_url = domain_url + search_type;

Step 5: Create a cell array to hold URL submission parameters.

params = {'q', text_search_value, 's', '0','n', num2str(number_of_results)};

Step 6: Make call to the geodex RESTful web service using webread.

options = weboptions('ContentType','json','Timeout',15);
results = webread(request_url,params{:},options);
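
The name-value pairs in params end up as the query string of the request URL. The sketch below assembles an approximately equivalent URL by hand so the request can be inspected or pasted into a browser. The variable names names, values, pairs, query, and full_url are illustrative and not part of the original example, and the exact encoding that webread applies internally is assumed rather than confirmed here.

```matlab
% Illustrative only: build the query string by hand from name-value pairs.
params = {'q', 'carbon', 's', '0', 'n', num2str(100)};
names  = params(1:2:end);                      % parameter names
values = params(2:2:end);                      % parameter values (char)

% Percent-encode each value, then join as name=value pairs separated by &.
encoded = cellfun(@urlencode, values, 'UniformOutput', false);
pairs   = strcat(names, '=', encoded);
query   = strjoin(pairs, '&');

full_url = "http://geodex.org/api/v1/textindex/searchset" + "?" + query;
disp(full_url)
```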

Step 7: Explore results.

first_result = results(1);
index = first_result.index;
high_score = first_result.highscore;
result_set = first_result.or;
top_result = result_set(1);
top_result_url = top_result.URL;
top_result_score = top_result.score;
output = string( ...
"Provider Index = " + index + newline + ...
"Number of Results = " + string(length(result_set)) + newline + ...
"High Score from Provider = " + string(high_score) + newline + ...
"URL of Top Result for Provider = " + top_result_url + newline + ...
"Score of Top Result = " + string(top_result_score));
fprintf('%s\n', output)
Provider Index = magic
Number of Results = 1
High Score from Provider = 0.93
URL of Top Result for Provider = https://earthref.org/MagIC/doi/10.1038/NGEO1427
Score of Top Result = 0.93
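
Step 7 looks only at the first provider. A minimal sketch of summarizing every provider follows, assuming the response decodes to a struct array with the fields index, highscore, and or used above. The mock data and the helper hit are made up for illustration so that the snippet runs without a network call.

```matlab
% Mock response with the same field names as the searchset results above.
% All values here are invented for illustration.
hit = @(u, s) struct('URL', u, 'score', s);
mock = struct( ...
    'index',     {'providerA', 'providerB'}, ...
    'highscore', {0.9, 0.5}, ...
    'or',        {[hit('http://example.org/1', 0.9), hit('http://example.org/2', 0.4)], ...
                  hit('http://example.org/3', 0.5)});

% Print one summary line per provider.
for k = 1:numel(mock)
    fprintf('%s: %d result(s), high score %.2f\n', ...
        mock(k).index, numel(mock(k).or), mock(k).highscore);
end
```

If the providers return objects with differing fields, the decoded response may be a cell array rather than a struct array, in which case the loop would need to index with braces.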

Step 8: Now set the type of search to harvest top n results across all providers.

search_type = "textindex/search";

Step 9: Set the new request domain URL for the geodex web service.

Please see http://geodex.org/swagger-ui/ for a complete description of the web service call formats.
% Create new request URL
request_url = domain_url + search_type;

Step 10: Make the call to the geodex RESTful web service.

Use webread to make the RESTful web service call.
results = webread(request_url,params{:},options);

Step 11: Examine the results.

number_of_results = length(results);
urls = string({results.URL});
firstURL = urls(1);
first_result = results(1);
firstScore = first_result.score;
output = string( ...
"Number of Results = " + string(number_of_results) + newline + ...
"URL of First Result = " + firstURL + newline + ...
"Score of First Result = " + string(firstScore));
fprintf('%s\n', output)
Number of Results = 100
URL of First Result = https://earthref.org/MagIC/doi/10.1038/NGEO1427
Score of First Result = 0.93
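
The later steps select results by host name with contains. Before doing so, it can help to see which hosts appear in the result set and how often. The following sketch counts URLs per host. The short urls list is a stand-in for the array built in this step, and the names hosts, unique_hosts, and counts are illustrative.

```matlab
% Stand-in for the urls array from the search results.
urls = ["https://earthref.org/MagIC/doi/10.1038/NGEO1427", ...
        "http://get.iedadata.org/doi/100537", ...
        "http://opentopo.sdsc.edu/lidarDataset?opentopoID=OTSDEM.092014.2871.1", ...
        "http://opentopo.sdsc.edu/lidarDataset?opentopoID=OTLAS.092014.2871.1"];

% Take the text between "://" and the next "/" as the host name.
hosts = regexp(cellstr(urls), '(?<=://)[^/]+', 'match', 'once');

% Count how many URLs belong to each distinct host.
[unique_hosts, ~, group] = unique(hosts);
counts = accumarray(group(:), 1);
for k = 1:numel(unique_hosts)
    fprintf('%s: %d\n', unique_hosts{k}, counts(k));
end
```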

Step 12: Now use the RESTful web service to query for metadata about a dataset.

search_type = "graph/details";
request_url = domain_url + search_type;
dataURL = urls(contains(urls,'iedadata'));
dataURL = dataURL(1);
params = {'r', dataURL};
results = webread(request_url,params{:},options);
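
Indexing dataURL(1) raises an error when no URL contains the search text, which can happen if the result set changes. A minimal sketch of a guard, using a stand-in urls array and the illustrative variable name matches:

```matlab
% Stand-in for the urls array from the search results.
urls = ["http://get.iedadata.org/doi/100537", ...
        "https://earthref.org/MagIC/doi/10.1038/NGEO1427"];

% Keep only URLs containing the host fragment, and fail with a clear
% message instead of an indexing error when nothing matches.
matches = urls(contains(urls, 'iedadata'));
if isempty(matches)
    error('No result URL contains ''iedadata''.');
end
dataURL = matches(1);
disp(dataURL)
```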

Step 13: Now examine the returned metadata.

output = string( ...
"Dataset Webpage URL = " + results.URL + newline + ...
"Name = " + results.Name + newline + ...
"Alternate Name = " + results.Aname + newline + ...
"URI = " + results.S + newline + ...
"Keywords = " + results.Keywords + newline + ...
"Description = " + results.Description + newline + ...
"Citation = " + results.Citation + newline + ...
"Date Published = " + results.Datepublished + newline + ...
"License = " + results.License + newline + ...
"Dataset Download Link = " + results.Curl);
fprintf('%s\n', output)
Dataset Webpage URL = http://get.iedadata.org/doi/100537
Name = Organic Carbon, in situ Temperature, Age Data, and Age Models in Scientific Ocean Drilling Holes (Deep Sea Drilling Project, Ocean Drilling Program, and Integrated Ocean Drilling Program)
Alternate Name =
URI = DOI:10.1594/IEDA/100537
Keywords = Global
Description = Abstract: Measurements of particulate organic carbon (POC), in situ sediment temperature, age data, and age models in scientific ocean drilling holes (Deep Sea Drilling Project, Ocean Drilling Program, and Integrated Ocean Drilling Program). The data are from drill holes in global areas of high POC deposition (continental margins and upwelling areas). The variation in POC content with temperature and age in the sediment column is used to study the process of organic matter decomposition.; Other Description: Malinverno, A. and Martinez, E. A. The effect of temperature on organic carbon degradation in marine sediments. Sci. Rep. 5, 17861; doi: 10.1038/srep17861 (2015).
Citation = Alberto Malinverno, Ernesto A. Martinez (2015), Organic Carbon, in situ Temperature, Age Data, and Age Models in Scientific Ocean Drilling Holes (Deep Sea Drilling Project, Ocean Drilling Program, and Integrated Ocean Drilling Program). Interdisciplinary Earth Data Alliance (IEDA). doi:10.1594/IEDA/100537
Date Published = 2015
License = Creative Commons Attribution-NonCommercial-Share Alike 3.0 United States [CC BY-NC-SA 3.0]
Dataset Download Link =
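
Instead of naming each field, the metadata struct can be printed generically by looping over its field names. The sketch below uses a small mock struct with made-up values; it assumes every field holds text, as in the output above. A field that decodes to an empty numeric array would produce an empty line under this approach.

```matlab
% Mock metadata struct; the values are invented for illustration.
meta = struct('Name', 'Example dataset', 'Datepublished', '2015', 'Curl', '');

% Print "field = value" for every field, including empty text fields.
fields = fieldnames(meta);
for k = 1:numel(fields)
    line = fields{k} + " = " + string(meta.(fields{k}));
    disp(line)
end
```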

Step 14: Examine Metadata from opencoredata.org

Use the first URL from opencoredata.org.
metadataURL = urls(contains(urls,'opencoredata.org'));
metadataURL = metadataURL(1);
params = {'r', metadataURL};
results = webread(request_url,params{:},options)
results = struct with fields:
S: 'http://opencoredata.org/id/dataset/003d174a-f203-4e72-b974-87ffc8b3a4d1'
Aname: ''
Name: '158_957I_JanusChemCarb_xhCoElXc.csv'
URL: 'http://opencoredata.org/id/dataset/003d174a-f203-4e72-b974-87ffc8b3a4d1'
Description: 'Janus Chem Carb for ocean drilling expedition 158 site 957 hole I'
Citation: ''
Datepublished: ''
Curl: 'http://opencoredata.org/api/v1/documents/download/158_957I_JanusChemCarb_xhCoElXc.csv'
Keywords: 'Leg Site Hole Core Core_type Section_number Section_type Top_cm Bot_cm Depth_mbsf Inor_c_wt_pct Caco3_wt_pct Tot_c_wt_pct Org_c_wt_pct Nit_wt_pct Sul_wt_pct H_wt_pct Janus Chem Carb DSDP, OPD, IODP, JanusChemCarb'
License: 'https://creativecommons.org/publicdomain/zero/1.0/'

Step 15: Examine Data from opencoredata.org

t = webread(results.Curl,weboptions('ContentReader',@(x)readtable(x)))
t = 1×17 table
 Leg Site Hole Core Core_type Section_number Section_type Top_cm Bot_cm Depth_mbsf Inor_c_wt_pct Caco3_wt_pct Tot_c_wt_pct Org_c_wt_pct Nit_wt_pct Sul_wt_pct H_wt_pct
1158957'I'1'N'1'S'33419NaN41NaNNaNNaN49''

Step 16: Examine Metadata from opentopo.sdsc.edu

metadataURL = urls(contains(urls,'opentopo.sdsc.edu'));
metadataURL = metadataURL(1);
params = {'r', metadataURL};
results = webread(request_url,params{:},options)
results = struct with fields:
S: 't3394382'
Aname: 'SONOMA_LIDAR'
Name: 'UMD-NASA Carbon Mapping /Sonoma County Vegetation Mapping and Lidar Program'
URL: 'http://opentopo.sdsc.edu/lidarDataset?opentopoID=OTSDEM.092014.2871.1'
Description: '<p>This survey covers all of Sonoma County as well as two small portions of southern Mendocino County. Data were provided by the University of Maryland and the Sonoma County Vegetation Mapping and Lidar Program under grant NNX13AP69G from NASA's Carbon Monitoring System (Dr. Ralph Dubayah, PI). The Sonoma County Vegetation Mapping and Lidar Program is NASA/UMD's local partner in this project- its members are as follows: the Sonoma County Agricultural Preservation and Open Space District, the Sonoma County Water Agency, the California Department of Fish and Wildlife, the United States Geological Survey, the Sonoma County Information Systems Department, the Sonoma County Transportation and Public Works Department, the Nature Conservancy, and the City of Petaluma.</p><p>These data will be used for various research projects throughout Sonoma County. They are critical to assessing climate mitigation and adaptation strategies and benefits provided by the landscape, such as the amount of carbon sequestration in forests or the degree to which riparian areas, floodplains, and coastal habitats may buffer extreme weather events. Other research applications include groundwater, ecosystem services valuation, ecosystem resiliency, and wildlife habitat connectivity. Finally, these data sets are key to facilitating good planning and management for watershed protection, flood control, fire and fuels management and wildlife habitat conservation.</p><p>The data was collected between September 28 and November 26, 2013 by Watershed Sciences, Inc. (WSI). Lidar was collected at high density greater than 8 pulses per square meter. WSI used two airplanes, one carrying a Leica ALS50 and the other a Leica ALS70; systems were flown at 900 meters above ground level, capturing a scan angle of 15 degrees from nadir (30 degree field of view). 4-band, 6-inch resolution aerial photography was collected simultaneously with the Lidar data.</p>'
Citation: 'Lidar data and orthophotography were provided by the University of Maryland under grant NNX13AP69G from NASA's Carbon Monitoring System (Dr. Ralph Dubayah and Dr. George Hurtt, Principal Investigators). This grant also funded the creation of derived forest cover and land cover information, including a countywide biomass and carbon map, a canopy cover map, and DEMs. The Sonoma County Vegetation Mapping and Lidar Program funded Lidar derived products in the California State Plane Coordinate System, such as DEMs, hillshades, building footprints, 1-foot contours, and other derived layers. The entirety of this data is freely licensed for unrestricted public use, unless otherwise noted. Any use of these data, including value-added products, within reports, papers, and presentations must acknowledge NASA Grant NNX13AP69G, the University of Maryland, and the Sonoma Vegetation Mapping and Lidar Program as their sources.<br>https://doi.org/10.5069/G9G73BM1'
Datepublished: ''
Curl: ''
Keywords: 'Sonoma County, Lidar, Vegetation and Habitat Mapping, Carbon Mapping, Biomass Mapping, Riparian Mapping'
License: 'http://www.opentopography.org/usageterms'

Step 17: Examine Data from opentopo.sdsc.edu

The data were downloaded using the web interface at http://opentopo.sdsc.edu/lidarDataset?opentopoID=OTSDEM.092014.2871.1
The interface requires you to enter your name and email address and to select a region. The Lidar data are returned in a GeoTIFF file.
The Lidar data were provided by the University of Maryland and the Sonoma Vegetation Mapping and Lidar Program under NASA Grant NNX13AP69G.
We use functions from Mapping Toolbox to read the GeoTIFF file, display it as a texture map, and display its boundaries on the webmap.
The file is named output_be.tif. The "be" suffix stands for "Bare Earth", which is one of the selections in the web form.
filename = 'output_be.tif';
[A,R] = geotiffread(filename);
figure
mapshow(A,R,'DisplayType','texturemap')
demcmap(A)
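
In MATLAB R2020a and later, the Mapping Toolbox documentation recommends readgeoraster in place of geotiffread. The following is a sketch of the same display step with that function, assuming output_be.tif is in the current folder. The isfile guard is an addition so that the snippet does nothing when the file is absent.

```matlab
filename = 'output_be.tif';
if isfile(filename)
    % Read the raster and its spatial reference, then display it as in
    % the step above.
    [A, R] = readgeoraster(filename);
    figure
    mapshow(A, R, 'DisplayType', 'texturemap')
    demcmap(A)
end
```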

Step 18: Display Corners on webmap

The corner coordinates confirm that the GeoTIFF image covers Santa Rosa.
info = geotiffinfo(filename)
info = struct with fields:
Filename: '/mathworks/public/Kelly_Luetkemeyer/GitHub/p418Noteboks/output_be.tif'
FileModDate: '31-May-2018 11:45:32'
FileSize: 231584457
Format: 'tif'
FormatVersion: []
Height: 8409
Width: 6883
BitDepth: 32
ColorType: 'grayscale'
ModelType: 'ModelTypeProjected'
PCS: ''
Projection: ''
MapSys: ''
Zone: []
CTProjection: 'CT_LambertConfConic_2SP'
ProjParm: [7×1 double]
ProjParmId: {7×1 cell}
GCS: 'Unknown datum based upon the GRS 1980 ellipsoid'
Datum: 'Not specified (based on GRS 1980 ellipsoid)'
Ellipsoid: 'GRS 1980'
SemiMajor: 6378137
SemiMinor: 6.3568e+06
PM: 'Greenwich'
PMLongToGreenwich: 0
UOMLength: 'US survey foot'
UOMLengthInMeters: 0.3048
UOMAngle: 'degree'
UOMAngleInDegrees: 1
TiePoints: [1×1 struct]
PixelScale: [3×1 double]
SpatialRef: [1×1 map.rasterref.MapCellsReference]
RefMatrix: [3×2 double]
BoundingBox: [2×2 double]
CornerCoords: [1×1 struct]
GeoTIFFCodes: [1×1 struct]
GeoTIFFTags: [1×1 struct]
x = info.CornerCoords.X;
y = info.CornerCoords.Y;
mstruct = geotiff2mstruct(info);
mstruct.geoid = referenceEllipsoid('wgs84','feet');
[lat,lon] = minvtran(mstruct,x,y)
lat = 1×4
38.479 38.479 38.41 38.41
lon = 1×4
-122.74 -122.67 -122.67 -122.74
lat(end+1) = lat(1);
lon(end+1) = lon(1);
webmap('usgstopo')
wmline(lat,lon)
Here is a snapshot of the webmap:
figure
imshow webmap_Santa_Rosa_USGS_Topo.png

Step 19: Obtain all Metadata from opentopo.sdsc.edu and create Word Cloud

This step reads all the metadata from opentopo.sdsc.edu and creates a word cloud of the citations.
metadataURLs = urls(contains(urls,'opentopo.sdsc.edu'));
% Start with an empty string array so no blank entry is passed to wordcloud
citations = strings(0,1);
for k = 1:length(metadataURLs)
    metadataURL = metadataURLs(k);
    fprintf('Processing: %s\n', metadataURL)
    params = {'r', metadataURL};
    results = webread(request_url,params{:},options);
    citations(end+1) = results.Citation;
end
Processing: http://opentopo.sdsc.edu/lidarDataset?opentopoID=OTSDEM.092014.2871.1
Processing: http://opentopo.sdsc.edu/lidarDataset?opentopoID=OTLAS.092014.2871.1
Processing: http://opentopo.sdsc.edu/lidarDataset?opentopoID=OTLAS.012012.26919.2
Processing: http://opentopo.sdsc.edu/lidarDataset?opentopoID=OTSDEM.032012.26916.1
Processing: http://opentopo.sdsc.edu/lidarDataset?opentopoID=OTLAS.032012.26916.1
figure
wordcloud(citations);
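
The citation text returned in Step 16 contains HTML markup such as <br>, which can show up as stray tokens in the word cloud. A minimal sketch of stripping tags before plotting follows. The sample citation is a shortened, made-up stand-in, and the variable name clean is illustrative.

```matlab
% Shortened stand-in for a citation string returned by the service.
citation = "Any use must acknowledge the grant.<br>https://doi.org/10.5069/G9G73BM1";

% Replace anything that looks like an HTML tag with a space.
clean = regexprep(citation, '<[^>]*>', ' ');
disp(clean)
```

The same call can be applied to the whole citations array before passing it to wordcloud, since regexprep accepts string arrays.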